A signal processing system for having the sound "pop-out" in noise thanks to the image of the speaker's lips: new advances using multi-layer perceptrons

Authors

  • Laurent Girin
  • Laurent Varin
  • Gang Feng
  • Jean-Luc Schwartz
Abstract

This paper deals with the improvement of a noisy speech enhancement system based on the fusion of auditory and visual information. The system was presented in previous papers and implemented in the context of vowel-to-vowel and vowel-to-consonant transitions corrupted with white noise. Its principle is an analysis-enhancement-synthesis process based on a linear prediction (LP) model of the signal: the LP filter is enhanced by associators that estimate cleaned LP parameters from both the noisy audio and the visual information. We recall the detailed structure of the system and focus on an improvement to the associators themselves: basic neural networks (multi-layer perceptrons) are used instead of linear regression. We show that, in the context of VCV transitions corrupted with white noise, neural networks can improve the performance of the system in terms of intelligibility gain, distance measures and classification tests.
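As a rough illustration of the analysis step and of the linear-regression associator that the abstract says the perceptrons replace, the following minimal sketch may help. It is not the authors' implementation: the autocorrelation method, the feature sizes, the synthetic data and all function names are assumptions made for this example only.

```python
import numpy as np

def lp_coefficients(frame, order):
    """Autocorrelation-method LP analysis (an assumed choice, not from the paper).

    Returns a[1..order] such that s[n] is approximated by sum_k a[k] * s[n-k].
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation for lags 0..order
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def fit_linear_associator(inputs, targets):
    """Least-squares map (with bias) from input vectors to target vectors."""
    X = np.hstack([inputs, np.ones((inputs.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return W

def apply_linear_associator(W, inputs):
    """Apply a fitted associator to new input vectors."""
    X = np.hstack([inputs, np.ones((inputs.shape[0], 1))])
    return X @ W

# --- LP analysis on a synthetic AR(2) signal (hypothetical data) ---
rng = np.random.default_rng(0)
true_a = np.array([1.3, -0.6])          # stable AR(2) coefficients
excitation = rng.standard_normal(5000)
signal = np.zeros(5000)
for n in range(2, 5000):
    signal[n] = true_a[0] * signal[n - 1] + true_a[1] * signal[n - 2] + excitation[n]
est_a = lp_coefficients(signal, order=2)

# --- Linear associator on hypothetical audio-visual feature vectors ---
# Each input row stands for [noisy LP parameters (3), lip parameters (2)];
# each target row stands for the cleaned LP parameters (3). Sizes are made up.
features = rng.standard_normal((50, 5))
mapping = rng.standard_normal((5, 3))
bias = rng.standard_normal(3)
clean_params = features @ mapping + bias
W = fit_linear_associator(features, clean_params)
predicted = apply_linear_associator(W, features)
```

In the system described above, the least-squares map would be replaced by a multi-layer perceptron trained on the same kind of input and target pairs, and the estimated parameters would then drive the LP synthesis filter.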

Similar resources

The Application of Multi-Layer Artificial Neural Networks in Speckle Reduction (Methodology)

Optical Coherence Tomography (OCT) uses the spatial and temporal coherence properties of optical waves backscattered from a tissue sample to form an image. An inherent characteristic of coherent imaging is the presence of speckle noise. In this study we use a new ensemble framework which is a combination of several Multi-Layer Perceptron (MLP) neural networks to denoise OCT images. The noise is...

A Convolutional Neural Network based on Adaptive Pooling for Classification of Noisy Images

The convolutional neural network is one of the effective methods for classifying images; it learns using convolutional, pooling and fully-connected layers. All kinds of noise disrupt the operation of this network. Noisy images reduce classification accuracy and increase convolutional neural network training time. Noise is an unwanted signal that destroys the original signal. Noise chang...

Grid Impedance Estimation Using Several Short-Term Low Power Signal Injections

In this paper, a signal processing method is proposed to estimate the low and high-frequency impedances of power systems using several short-term low power signal injections for a frequency range of 0-150 kHz. This frequency range is very important, and thus it is considered in the analysis of power quality issues of smart grids. The impedance estimation is used in many power system applicati...

Optimum decoder for multiplicative spread spectrum image watermarking with Laplacian modeling

This paper investigates the multiplicative spread spectrum watermarking method for images. The information bit is spread into middle-frequency Discrete Cosine Transform (DCT) coefficients of each block of an image using a generated pseudo-random sequence. Unlike the conventional signal modeling, we suppose that both signal and noise are distributed with Laplacian distribution, because the ...

An Automated MR Image Segmentation System Using Multi-layer Perceptron Neural Network

Background: Brain tissue segmentation for delineation of 3D anatomical structures from magnetic resonance (MR) images can be used for neuro-degenerative disorders, characterizing morphological differences between subjects based on volumetric analysis of gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF), but only if the obtained segmentation results are correct. Due to image arti...

Publication date: 1998